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Abstract 

Compressive sensing has become a powerful addition to uncertainty quantification in recent years. This paper 
identifies new bases for random variables through linear mappings such that the representation of the quantity of 
interest is more sparse with new basis functions associated with the new random variables. This sparsity increases 
both the efficiency and accuracy of the compressive sensing-based uncertainty quantification method. Specifically, 
we consider rotation-based linear mappings which are determined iteratively for Hermite polynomial expansions. 
We demonstrate the effectiveness of the new method with applications in solving stochastic partial differential 
equations and high-dimensional ((9(100)) problems. 

Keywords: uncertainty quantification, generalized polynomial chaos, compressive sensing, iterative rotations, 
active subspace, high dimensions. 
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1. Introduction 

Uncertainty quantification (UQ) plays an important role in constructing computational models as it helps to 
understand the influence of uncertainties on the quantity of interest. In this paper, we study parametric uncertainty, 
which treats some of the parameters as random variables. Let (fl, T, P) be a complete probability space, where 
Q is the event space and P is a probability measure on the cr-field p". We consider a system depending on a 
^f-dimensional random vector ^(cu) = ■ • ■ ,^d(ci)V, where to is an event in Q. For simplicity, we 

denote f,(cu) as f,. We aim to approximate the quantity of interest u(^) with a generalized polynomial chaos (gPC) 
expansion [1,2]: 

N 

+ ( 1 . 1 ) 

H=1 

where e is the truncation error, V is a positive integer, c„ are coefficients, i/r„ are multivariate polynomials which 
are orthonormal with respect to the distribution of 

f ( 1 . 2 ) 

JkA 

where p(^) is the probability distribution function (PDF) of ^ and 6ij is the Kronecker delta. The approximation 
converges in the L 2 sense as N increases if u is in the Hilbert space associated with the measure of ^ (i.e., the 
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weight of the inner product is the PDF of [2, 3, 4]. Stochastic Galerkin and probabilistic collocation are 
two popular methods [1, 2, 5, 6, 7, 8] used to approximate the gPC coefficients c - (ci,C 2 , - ■ ■ , cnV- Stochastic 
collocation starts by generating samples of input 1,2, • • • ,M based on p(^). Next, the computational model 

is calculated for each to obtain corresponding samples of the output Finally, c are approximated 

based on and Note that in many practical problems, it is very costly to obtain and, due to the limited 
computational sources, we will often have M < N or even M N. The smaller number of samples than basis 
functions implies that the following linear system is under-determined: 

- u + e, (1.3) 

where u - ,u^Y is the vector of output samples, is an M x N matrix with T',j = 

e - - is a vector of error samples with The compressive sensing method is effective 

at solving this type of under-determined problem when c is sparse [9, 10, 11, 12] and recent studies have applied 
this approach to uncertainty quantification (UQ) problems [13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]. 

Several useful approaches have been developed to enhance the efficiency of solving Eq. (1.3) in UQ appli¬ 
cations. First, re-weighted t\ minimization assigns a weight to each c„ and solves a weighted t\ minimization 
problem to enhance the sparsity [25]. The weights can be estimated in a priori [18, 26] or, for more general 
cases, can be obtained iteratively [15, 17]. Second, better sampling strategies can be used, such as minimizing the 
mutual coherence [27, 20]. Third, Bayesian compressive sensing method provides the posterior distribution of the 
coefficients [23, 16]. Finally, adaptive basis selection selects basis functions to enhance the efficiency instead of 
fixing the basis functions at the beginning [22]. Recently, we propose an approach [17] to enhance the sparsity of 
c through the rotation of the random vector ^ to a new random vector j/, where the rotation operator is determined 
by the sorted variability directions of the quantity of interest u based on the active subspace method [28]. 

In this work, we aim to extend our previous work [17] and consider the specific case where the system depends 
on i.i.d. Gaussian random variables; i.e., ^ ~ Af(0, /) where 0 is a ^f-dimensional zero vector and / is a xc/ identity 
matrix. This assumption appears in a wide range of physics and engineering problems. We aim to find a mapping 
g : which maps ^ to a new set of i.i.d. Gaussian random variables j/ = (?7i, ? 72 , • • • , qdY such that the 

gPC expansion of u with respect to // is sparser. In other words, 

N N 

M(^) ^ = E ^«'A«(j/(^)) ^ uiniOl (1-4) 

n=l n=l 

where are orthonormal polynomials associated with the new random vector i] and c„ are the corresponding 
coefficients. Note that i/r„ = if/„ since // ~ Af(0, /). We intend to find the set c - (ci,C 2 , • • • ,cnY which is 
sparser than c while preserving the properties of matrix Y (with 'T,^ = close to those of ^ to improve the 

efficiency of the compressive sensing method. To accomplish this, we will use a linear mapping, based on the idea 
of active subspaces [28], to obtain i] as first proposed in [17]. Unlike our previous work, we build this mapping 
iteratively in order to obtain a sparser c and improve the efficiency of the gPC approximation by compressive 
sensing. We also provide the analytical form of the “gradient matrix” (see Eq.(3.3)) to avoid estimating it with 
Monte Carlo methods. Our method is applicable for both {q and minimization problems. Especially, for the 
latter, we can also integrate the present method with re-weighted {i minimization method to further reduce the 

2 


error. We demonstrate that, compared with the standard compressive sensing methods, our approach reduces the 
relative L 2 error of the gPC approximation. 


2. Brief review of the compressive sensing-based gPC method 

2.1. Hermite polynomial chaos expansions 

In this paper we study systems relying on ^f-dimensional Gaussian random vector ^ ~ Af(0, /). Therefore, 
the gPC basis functions are constructed by tensor products of univariate orthonormal Hermite polynomials. For a 
multi-index a - (ai,a 2 , • • • , ad), a,- e N U {0), we set 




( 2 . 1 ) 


For two different multi-indices or; = ((ai )^, (ai )^, • • • , (ai)^) and aj = ((aj ),, (aj )^, • • • , (aj)^), we have the property 

^ — 'Jq-jO'j — ^(ai)j(aj)j^(ai)^(aj)^ ' ' ' j (2.2) 


where 


p(0 


sj2n 


exp 




(2.3) 


For simplicity, we denote as i/',(^). 


2.2. Compressive sensing 

The vector c in Eq. (1.3) can be approximated by solving the following optimization problem: 

(Ph.id ■ argmin||c||/„ subject to || - h ||2 < e, (2.4) 

C 

where e - ||fi ||2 and h is typically set as 0 or 1. For h - Q {(q minimization problem), the greedy Orthogonal 
Matching Pursuit (OMP) algorithm [29, 12] can be applied; for h = \ {{\ minimization problem), convex opti¬ 
mization methods are directly applicable [30]. As pointed out in [12], OMP is very efhcient - when it works - 
but convergence to a sparse solution is not always guaranteed. There are specific cases where a sparse solution is 
possible while OMP yields a dense one. Since both the OMP and minimization approaches are widely used, 
we will demonstrate the effectiveness of our new method for both methods. 

Next, we introduce the concept of sparsity as it is critical in the error estimates for solving the under-determined 
system Eq. (1.3) with the compressive sensing method. The Iq “norm” of vectors: = (xi, X 2 , ■ • • , xn) is dehned as 
the number of its non-zeros entries [31, 9, 12] 

||x||o“#(/:x,-4 0] (2.5) 

and norm is defined as the sum of the absolute value of its entries: 

N 

^ 2 . 6 ) 

n-1 

X is called s-sparse if ||x||o < s, and x is considered a sparse vector if s <& N. Eew practical systems have a truly 
sparse gPC coefhcients c. However, in many cases, the c are compressible, i.e., only a few entries make significant 
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contribution to its (i norm. In subsequent discussion, we relax the definition of “sparse”: x is considered sparse if 
IIj: - jCjIIi is small for s ^ N. Here jCj is defined as the best s-sparse approximation one could obtain if one knew 
exactly the locations and amplitudes of the i-largest entries of x, i.e., Xs is the vector x with all but the i-largest 
entries set to zero [11]. 

The error bound for solving Eq. (1.3) with minimization requires definition of the restricted isometry prop¬ 
erty (RIP) constant [32]. For each integer s = 1,2, • ■ ■, the isometry constant 6 s of a matrix 0 is defined as the 
smallest number such that 

(l-«IWl2<ll^^ll2^(l+5.)IWl2 (2.7) 

holds for all s-sparse vectors x. With some restrictions, Candes et al. showed x can be stably reconstructed [11]. 
Assume that the matrix W satisfies 62 s < V2 - 1, and ||e ||2 < e, then solution c to (Pi.e) obeys 

||c-c||2<Cie + C2^^^^^, (2.8) 

s/s 

where Ci and C 2 are constants, c is the exact vector we aim to approximate and c is the solution of (Pi,f). This 
result implies that the upper bound of the error is related to the truncation error and the sparsity of c, which is 
indicated in the first and second terms on the right hand side of Eq. (2.8), respectively. 

The re-weighted t\ minimization approach is an improvement of the t\ minimization method, which enhances 
the accuracy of estimating c [25]. The re-weighted t\ approach solves the following optimization problem: 

(Pf^) : argmin||Pl/c||i, subject to IHPc - h ||2 < e, (2.9) 

' C 

where IV is a diagonal matrix: 1/1/ = diag(wi, W 2 , • • • , wn). Clearly, (Pi,e) can be considered as a special case of 
{PYJ by setting W - I. The elements w, of the diagonal matrix can be estimated based on analysis of u as in 
Peng et al. [18], or be estimated iteratively [25, 15]. More precisely, for each iteration I, (PYJ is solved to obtain 
and then - l/(|c*^'| H- d) for the next step. The parameter 5 > 0 is introduced to provide stability and 

to ensure that a zero-valued component in does not prohibit a nonzero estimate at the next step. In Candes et 
al. [25], the authors suggest two to three iterations of this procedure. Subsequent analytical work [33] provides an 
error bound for each iteration as well as the limit of computing c with re-weighted minimization. The form is 
similar to Eq. (2.8) with different constants. 

In practice, the error term e is not known a priori, hence cross-validation is needed to estimate it. One such 
algorithm is [13] summarized in Algorithm 1 : 

Algorithm 1 Cross-validation to estimate the error e 

1: Divide the M output samples to reconstruction {Ur) and My validation («„) samples and divide the mea¬ 
surement matrix 0 correspondingly into U/. and U'y. 

2 : Choose multiple values for such that the exact error || - UrWi of the reconstruction samples is within the 

range of values. 

3: For each ey, solve (P/,,f) with Uy and Wy to obtain c, then compute - || UtyC - u^Wi- 
4: Find the minimum value of and its corresponding e,. Set e = s/M/Myey. 


We omit the review of the theoretical results for the OMP as well as its variants, and refer interested readers to 
the literature [12, 34, 35]. Similar to the approach, the error estimate for OMP includes a term which depends 
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on the sparsity of c. This is a critical point that motivates us to propose the new method described in the next 
section. 

2.3. Compressive sensing-based gPC methods 

Given M samples of the quantity of interest u is approximated by a gPC expansion as in Eq. (1.1): 

N 

^ ( 2 . 10 ) 

«=1 

which can be rewritten as Eq. (1.3). A typical approach to compressive sensing based-gPC is summarized in 
Algorithm 2. 

Algorithm 2 Compressive sensing-based gPC 
1: Generate input samples ^‘^,q - 1,2, • • • ,M based on the distribution of 

2: Generate output samples by solving the complete model; e.g., running simulations, solvers, etc. 

3: Select gPC basis functions associated with ^ and then generate the measurement matrix Ut by setting 

'Pp - <A,/(^'). 

4: Solve the optimization problem (P/,,f): 

argmin ||c||/,, subject to||’/'c - m||2 < e, 

C 

where /r = 0 or 1, m = (m', m^, • • • , m“)^, and e is obtained by cross-validation. If the re-weighted method 
is employed, solve instead. 

5: Set c - c and construct gPC expansion as u(^) ^ CniAn)^)- 


Note that the RIP condition in Theorem 2.2 is sufficient but not necessary; furthermore, it is difficult to obtain 
the exact RIP constant in practical problems. A more tractable property of the measurement matrix for calculation 
is the mutual coherence [12]: 




max -, 

l<j,k<NJi=k 


( 2 . 11 ) 


where *Fy and *P<. are columns of W. In general, a measurement matrix with smaller mutual coherence is better 
able to recover a sparse solution with the compressive sensing method. Note that E |i/f,(^)^j(^)| = dij since 
are orthonormal polynomials. Therefore, asymptotically, p( W) converges to zeros according to the strong law of 
large numbers. 

In the next section, we will demonstrate that our new method increases the sparsity of c without changing 
p significantly, and hence, our method is able to improve the accuracy of the compressive sensing-based gPC 
method. 


3. Iterative rotations for increasing sparsity 

In this section, we provide a heuristic method to identify the rotation matrix by computing the eigenvalue 
decomposition of a gradient matrix G. The rotation increases the sparsity of the gPC expansions of quantity of 
interest u with respect to a new set of random variables. The rotation procedure is applied iteratively to achieve 
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a target sparsity level. The enhancement of the sparsity decreases the second term (sparsity-induced error) on 
the right hand side of Eq. (2.8). For the cases where this sparsity-induced error dominates the total error of the 
compressive sensing method, our new approach improves the overall accuracy. 

From Eq. (2.8), we notice that if c is exactly sparse (i.e., c = c ,. for some s* <Si N) and if the RIP condition 
is satisfied for s > s*, then ||c - c^H = 0. Therefore, the upper bound of the error only depends on e. In practical 
problems, c is usually not exactly sparse. But if the truncation error e is sufficiently small, then the second term on 
the right hand side of Eq. (2.8) dominates the upper bound of the error. Hence, in order to improve the accuracy 
of the gPC expansion, we need to decrease ||c - Cj||i/ V*- However, once the gPC basis functions are selected, c, 
and therefore ||c - Cj||i/ ■\fs, are fixed. A natural way to enhance the sparsity of c is to find another set of random 
variables J; = (?7i, 772 , • • • , which depend on ^ such that the vector c, which are the gPC coefficients of u with 
respect to i], is sparser. In order words, our goal is to seek j/(^) with 

N N 

m (^) ~ Yj ^ Y ~ u(n(0)^ 

«=1 n=l 

such that ||c - Cs||i < ||c - Cs||i. Note that N does not necessarily equal N and d can be different from d. We will 
denote the mapping from ^ to // as g : M"' 

There are several behaviors that our proposed approach must exhibit. 

• The PDF oft] must be computed efficiently. The first step of generating a new gPC expansion is to obtain 
the PDF of i]. Hence, if g is complicated, the PDF of i] will be difficult to obtain. Even if g is a simple 
function, it can still be difficult to obtain an accurate PDF if the dimension is large. 

• The new gPC basis functions associated with t] must be computed efficiently. If the 77, are independent, then 
the new gPC basis can be constructed as the tensor product of univariate basis functions in each dimension. 
Although this is not necessary, it will make the construction of new basis functions easier as it avoids the 
computation of high-dimensional integrals. 

• The properties of the measurement matrix must be preserved. Clearly, the measurement matrix changes as 
we introduce new random variables and new basis functions. Even though we may construct a very sparse 
c, if the key properties of the measurement matrix are altered too much (e.g., the RIP constant or mutual 
coherence increases dramatically), we may be unable to obtain an accurate result with the compressive 
sensing method. 

• No additional output samples must be needed. In particular, the existing output samples u‘t,q - 1,2,-- - ,M 
should be sufficient. This is especially important for the cases when the model (simulation or deterministic 
solver) is very costly to compute. 

In this work, we focus on the special case of normal distributions ^ ~ Af(0, /); hence, the i/r, are constructed as 
the tensor product of univariate orthonormal Hermite polynomials as shown in Eq. (2.1). We aim to find a linear 
mapping g : R"' such that 

7/-g(^) = A^, (3.1) 


6 


where A is an orthonormal matrix. 

If we find this matrix A then all of the aforementioned behaviors can be obtained. We know that i] ~ Af(0, /) 
since AA^ - I. Therefore, the samples of i/ can be obtained as = A^'^, where are generated at the beginning 
(Step 1 in Algorithm 2). Since rj ~ N{0,1) we can set and no additional computation is needed. The 

difference between W and W is that the latter is constructed by evaluating orthonormal Hermite polynomials at 
another set of samples of i.i.d. Gaussian random variables; i.e., '1',^ = ijfjin') - Therefore, the mutual 

coherence of W converges to 0 as that of and the difference between p( W) and yu( W) is the deviation 

of the Monte Carlo numerical integral from the exact value. No additional samples are required since the 
improvement of accuracy is achieved by enhancing the sparsity of gPC coefficients. 

Given the Hermite polynomials defined above, we have a new expansion for u: 

N N 

m (^) ~ ^ Cn^ni^) = ^ Cn^PniH)) ~ «(»/) ( 3 . 2 ) 

n=l n=l 


with c sparser than c. In order to obtain the A, we adopt the active subspace approach [28]. We first define the 
“gradient matrix”: 

G “ E {Vm(^) ■ Vm(^)^} = UAU'^, UU'^ = /, (3.3) 

where G is symmetric, Vm(^) = {duld^i,duld^ 2 , • • • ,dufd^dV is a column vector, U - {U\, U 2 , • • • , Ud) is an 
orthonormal matrix consisting of eigenvectors Uj, and A = dmg{Ai, A 2 , • • • , Ad) with Ti > /i 2 > • ■ ■ is a diagonal 
matrix with elements representing decreasing variation of the system along the respective eigenvectors. We choose 
A - which, as a unitary matrix, defines a rotation in and the linear mapping g in Eq. (3.1) projects ^ on 
the eigenvectors Ui . Consequently, when the differences between |/l,| are large, g helps to concentrate the 
dependence of u primarily on the first few new random variables 77 , due to the larger variation of u along the 
directions of the corresponding eigenvectors. Therefore, we obtain a sparser c than c. We note that this approach 
of constructing G is similar to the method of outer product gradients (OPGs) in statistics [36, 37]. The information 
of the gradient of u is also utilized to improve the efficiency of compressive sensing in the gradient-enhanced 
method [38, 39, 40, 22,24]. 

Since u is not known a priori, we replace it with its gPC expansion Ug = Cn^niO- In prior work by 
Constantine and others [28, 17], the expectation is obtained by taking the average of the Monte Carlo results. In 
the current work, we compute G differently: after obtaining c with compressive sensing method, we construct a 
gPC approximation Ug to u and approximate G accordingly: 


G 


( 

f ^ 

\ 

e|v 

V c„iA„(^) 

1 

U=1 

7 


^ Cn'ipn'i^) 




(3.4) 
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The entries of G can be approximated as; 


f N 


]d^i 


\ r, { N 

o 




= E< 


f N 

Z' 


VV«=i 
N N 


dxjJniO 
' d^i 


Zj^"' 

\n'-\ 




J 


0.5) 


«=1 «'=1 
= KjjC, 


dl//niO dljJn'iO 
d^i ' d^j 


where Ki , is a “stiffness” matrix with entries 




(3.6) 


Notice that Kjj can be precomputed since {i)/,} are normalized Hermite polynomials (see Appendix for details). G 
is a d X d matrix, where d is the number of random variables in the system. Since G is a symmetric matrix, we 
only need to compute d{d + l)/2 of its entries. Furthermore, unlike the active subspace method, which focuses on 
the subspace of K.^*, we keep the dimension and set of basis functions unchanged. 

The entire iterative procedure is summarized in Algorithm 3. This algorithm adds post-processing steps to 


Algorithm 3 Compressive sensing method with iterative rotations 
1 : For given random vector ^ and quantity of interest u, run Algorithm 2 to obtain approximated gPC coefficients 

c. 

2: Set counter I — 0, = c. 

3: Construct G^^' with according to Eq. (3.5). Then decompose as 


G('+i) = = /. 


4: Define and compute samples - (G^^'^'*)^(i/®)^, q- 1,2, • • ■ ,M. Also, construct 

the new measurement matrix with = ip 
5: Solve the optimization problem (P/, f(;+i)): 

argmin||c||/„ subject to|| - lilb < 

C 

and set - g. If reweight (\ method is employed, solve (PJ'^j+d) instead. 

6 : Set / = / -I- 1. If I IIG® 111 — d\< 6, where the threshold 6 is a positive real number, then stop. Otherwise, go to 
Step 3. 


Algorithm 2, which is designed to increase the accuracy of compressive sensing based gPC method. In Step 
5, we use notation since the estimated error at iteration I + I may be different from e. According to our 
numerical experiments (see Sec. 4), it is usually sufficient to test two or three different values on [e/5, e] in the 
cross-validation procedure (see Algorithm 1) to obtain 

In Algorithm 3, we propose a terminating condition based on the norm of the rotation matrix in each 
iteration; 5(G®) Yfi=\ IIG^^Hi. If is the identity matrix or a permutation matrix, we need no further 
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iterations, and S{U^^^) - d. Otherwise, S(U^^^) > d since = 1 and 



|2 


/ 


Vy=i / 



[+2 |([/®),||(t/'\|>i. 


l<j,k<dj¥=k 


(3.7) 


Hence, one may set a threshold 6 and the iteration stops when \S{U^'^) - d\ < 6. Empirically, 6 can be set around 
Q.ld ~ Q.2d. More sophisticated terminating conditions (e.g., sparsity or e estimates) are also possible. A 
rigorous theoretical analysis on the convergence behavior is not available at this time. The criterion presented 
here provides an approach to estimate, to some extend, whether our method converges. Empirically, when this 
stopping criterion is satisfied, additional iterations will not improve the accuracy significantly. We also note 
that the simplest terminating condition in Step 6 is to set a maximum iteration steps L. Based on our numerical 
examples in Sec. 4, this simple condition can also be useful. In general, the efficiency of our method depends on 
the intrinsic sparsity of the system, i.e., whether the system mainly relies on a small amount of subspaces. The 
fewer subspaces the system depends on, the better performance our method exhibits. Otherwise, this method is 
less effective, e.g., an extreme case is u(^) - Yfi=i f?’ for which the iterative rotations based on current framework 
does not help. 

4. Numerical results 

In this section, we present five numerical examples to demonstrate the effectiveness of our new method. The 
accuracies of different methods are measured by the relative L 2 error: (||m - Mg|| 2 )/I|M|| 2 , where Ug is the Hermite 
polynomial expansion of u. The integral 



(4.1) 


(and \\u - Ug\\ 2 ) is approximated with a high-level sparse grid method which is based on one-dimensional Gauss- 
Hermite quadrature and the Smolyak structure [41]. The term “level” p means that the algebraic accuracy of the 
sparse grid method is 2p - 1. We use P to denote the truncation order, which implies that Hermite polynomials up 
to order P are included in expansion Ug. Hence, the number of unknowns can be computed as N - {)■ 

The relative errors we present in this section are obtained from 100 independent replicates for each sample size 
M. Eor example, we generate 100 independent sets of input samples ^‘^,q - 1,2, • • • , M, compute 100 different 
relative errors, and then report the average of these error samples. To investigate the effectiveness of the increasing 
of output samples, we set the jc-axis in our figures as the ratio MjN which is the fraction of available data with 
respect to number of unknowns. We use MATLAB package SPGLl [42, 43] to solve (Pi,e) as well as (Pj'"^) and 
use SparseLab [44] for the OMP method. If not otherwise indicated, results are obtained with L = 3 iterations in 
Step 6 of Algorithm 3. 

4.1. Example function with equally important random variables 
Consider the following function 



(4.2) 
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Figure 1: Results for the example function with equally important random variables. (Left) Comparison with the method, “o”; ^i, 
rotated with 3 iterations, rotated with 6 iterations, “o”: rotated with 9 iterations, re-weighted €\. (Right) Comparison with 
the OMP method, “o”: OMP, rotated OMP with 3 iterations, rotated OMP with 6 iterations, rotated OMP with 9 iterations. 
These calculations were performed with dimension d = \2 and the number of unknowns N = 455. 


where all are equally important. In this case, adaptive methods that build the surrogate model hierarchically 
based on the importance of (e.g., [45,46,47,48]) may not be efficient. A simple rotation matrix for this example 
has the form 


^- 1/2 ,,, ^^ 1 / 2 ) 


A = 


A 


\ / 

where A k z. {d — \) x d matrix chosen to ensure that A is orthonormal. Given this choice for A, then rfx - 
(2f=i u has a very simple representation: 

m(^) = m(j/) = d^^\ + 0.25dj]\ + OmSd^l^j]]. 

Therefore, as we keep the set of the basis functions unchanged, all the Hermite polynomials not related to 771 make 
no contribution to the expansion, which implies that we obtain a very sparse representation of u. Unfortunately, 
the optimal structure is not known a priori, hence, the standard compressive sensing cannot take advantage of it. 

In this test, we set d - 12 (hence, N = 455 for P - 3) and demonstrate the effectiveness of our new method. 
The integrals for calculating the L 2 error are computed by a level 4 sparse grid method, hence they are exact. 
The relative error of minimization and OMP are presented in Fig. 1. Clearly, the standard minimization and 
OMP are not effective as the relative error is close to 100% even when M/N > 0.4. Also, the re-weighted does 
not help in this case. However, our new iterative rotation demonstrates much better accuracy, especially when 
M is large. As demonstrated in Fig. 2 the iterative rotation creates a much sparser representation of u, hence the 
efficiency of compressive sensing method is substantially enhanced. We notice that the accuracy increases as more 
iterations are included. However, the improvement from 6 iterations to 9 iterations is less significant as that from 
3 iterations to 6 iterations, especially for the OMP-based iterative rotation method. In general, the improvement 
afforded by iterative rotation becomes small after 3 iterations. 
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Figure 2: Magnitude of the gPC coefficients for the example function with equally important random variables. (Left) Magnitude of (Right) 
Magnitude of of a randomly chosen replicate computed by rotated with 9 iterations and M = 180 (MjN « 0.4). These calculations were 
performed with dimension d = 12 and the number of unknowns N = 455. 


This contrived example demonstrates that our new method is capable of enhancing the sparsity of the Hermite 
polynomial expansion, even with a very inaccurate in Step 2 of Algorithm 3 when other methods fail. 

4.2. Example function with high compressibility 
Consider the following function; 

p N 

M(^) = 2 Calpai^) = ^ ^ = (fl, ft, • • • , ft), (4.3) 

|q'|= 0 n=l 

where, are normalized multivariate Hermite polynomials, d - \2,P - 3, N - 455, and the coefficients c„ are 
chose as uniformly distributed random numbers, 

Cn^fln^\ ^~'W[-1,1]. (4.4) 

For this example, we generate N samples of f: then divide them by n' = 1,2, •• • ,N to obtain 

a random “compressible signal” c. The integrals for the relative error are computed by a level-4 sparse grid 
method and are therefore exact. Figure 3 shows the the relative error with different numbers of iterations (1-3) 
for the t\ minimization and OMP methods. Our new iterative rotation method improves the accuracy of the gPC 
approximation for both methods. As before, benefit of increased iterations drops sharply near L - 3. Therefore, 
in the remainder of this paper we use L - 3 iterations unless otherwise noted. 

Figure 4 shows results obtained by applying our iterative rotation technique to the re-weighted approach 
using L - 3 iterations. The results for the iterative rotation approach for OMP are also presented in Figure 4. For 
all methods, introduction of the iterative rotation approach improves the results. A comparison of the sparsity of 
c and c is presented in Fig. (5). The main improvement is that the number of coefficients with magnitude larger 
than 0.01 is decreased. Also, |c,j| cluster around the line |c,j| = as we set them in this way, while many |c„| 

are much below this line especially when n is large. 
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M/N M/N 

Figure 3: Results for the example function with high compressibility. Relative L 2 error for different numbers of iterations for minimization 
(left) and the OMP method (right), “o”: standard (left) or standard OMP (right), 1 iteration, 2 iterations, “o”: 3 iterations. These 
calculations were performed with dimension d = 12 and number of unknowns N = 455. 




M/N M/N 


Figure 4: Results for the example function with high compressibility. (Left) Compaiison with £i methods, “o”: standard £\, re-weighted 
rotated £\, “o”: re-weighted and iteratively rotated ^ 1 . (Right) Compaiison with OMP methods, “o”: OMP, rotated OMP. These 
calculations were performed with dimension d = \2 and number of unknowns N = 455. 


4.3. Example elliptic differential equation 

Next we consider a one-dimensional elliptic differential equation with a random high-order coefficient: 


dx 




du{x', 


dx 

u(0) = m(1 ) = 0, 


= 1 , 6 ( 0 , 1 ) 


(4.5) 


where a(x; ^) is a log-normal random field based on Karhunen-Loeve (KL) expansion: 

f d \ 


a(x; ^) = ao(x) + exp 


0 “^ yEi<Piix)^i 


\ i=l 
d 


(4.6) 


where are i.i.d. standard Gaussian random variables, and {0/(ji:))/_j are the largest eigenvalues and 

corresponding eigenfunctions of the exponential covariance kernel: 


C(x, x') = exp 


Ir 


(4.7) 
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Figure 5: Magnitude of the gPC coefficients for example function with high compressibility. (Left) Magnitude of c„. (Right) Magnitude of 
Cn of a randomly chosen replicate computed by re-weighted and iteratively rotated £[ with M = 180 {MIN » 0.4). These calculations were 
performed with dimension d = \2 and the number of unknowns N = 455. 


In the KL expansion, A, denotes the eigenvalue of the covariance kernel C(x, x') instead of entries of A in Eq. (3.5). 
The value of A, and the analytical expressions for (pi are available in the literature [49]. In this example, we set 
ao(x) s 0.1, cr = 0.5, Ic - 0.2 and d - 15. With this setting, 4, > 0.93 4,-. For each input sample a 

and u only depend on x and the solution of the deterministic elliptic equation can be obtained as [15]; 


u(x) — u(0) ■ 


f 


a(0)u(0y - y 

a(y) 


dy. 


By imposing the boundary condition m(0) = m( 1) = 0, we can compute a(0)M(0)' as 




(4.8) 


(4.9) 


The integrals in Eqs. (4.9) and (4.8) must be obtained by highly accurate numerical integration. For this example, 
we choose the quantity of interest to be m(x;^) at x = 0.35. We aim to build a 3rd-order Hermite polynomial 
expansion which includes A = 816 basis functions. The relative error is approximated by a level-6 sparse grid 
method. Figure 6 shows that accuracy of the re-weighted (3 iterations) and the iteratively rotated £i {L — 3 
iterations) method are very close in this case. Figure 6 shows the results of the iterative rotation process applied 
to the OMP method. In all cases, the incorporation of iterative rotation improves the performance of the other 
methods. A comparison of c and c are presented in Fig. 7, which shows the improvement of the sparsity in the 
similar manner as in example function with high compressibility in Sec. 4.2. 


4.4. Example Korteweg-de Vries equation 

As an example application of our new method to a more complicated and nonlinear differential equation, we 
consider the Korteweg-de Vries (KdV) equation with time-dependent additive noise [50]: 


Defining 


u,{x, t; g) - 6u{x, t; g)ih{x, t; g) + u,„^(x, t; g) = f(t; g), x 6 (-oo, oo), 
u{x, 0; g) - -2 sech^(.r). 


W(f,g) 


f /(y;^)dy, 

Jo 
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(4.10) 


(4.11) 







Figure 6: Results for the example elliptic differential equation. (Left) Comparison with {\ methods, “o”: standard re-weighted i\, 

rotated l\, “o”: re-weighted and iteratively rotated i\. (Right) Comparison with OMP methods, “o”: standard OMP, rotated OMP. 
These calculations were performed with dimension = 15 and the number of unknowns N = 816. 



Figure 7: Magnitude of the gPC coefficients for example elliptic differential equation. (Left) Magnitude of (Right) Magnitude of Cn of a 
randomly chosen replicate computed by re-weighted and iteratively rotated with M = 240 {MjN ^ 0.3). These calculations were perfonned 
with dimension d = \5 and the number of unknowns N = 816. 


the analytical solution of Eq. (4.10) is 

-2 sech^^x - 4t + 6 J" lE(z;^)dzj. 

We model f(t; as a Gaussian random field represented by the following KL expansion: 

d 

i=\ 


(4.12) 


(4.13) 


where cr is a constant and {/I/, <^,(f))f_i are eigenpairs of the exponential covariance kernel as in Eqs. (4.6) and 
(4.7), respectively. In this problem, we set 4 = 0.25 and d - 10 (2f_i 4, > 0.96 4,). In this case, the exact 

one-soliton solution is 


d pt 

u(x, a- V VTfi I 

40 


(pi{y)Ay - 2 sech" 


X — At + 


6-z r r 

40 Jo 


^i(y)dydz 


(4.14) 
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Figure 8: Results for the example KdV equation. (Left) Comparison to methods, “o”: standard £\, re-weighted €\, rotated £[, 
“o”: re-weighted and iteratively rotated i\. (Right) Comparison to the OMP method, o”: standard OMP, rotated OMP. The dimension is 
d = and the number of unknowns is TV = 1001. 


Since an analytical expression for (pi is available, we can compute the integrals in Eq. (4.14) with high accuracy. 
Denoting 


Ai = ^|Ai I 0/(y)dy, B,- = ^/A, 

the analytical solution is 


i,(y)dydz, i-1,2, ■■■,(/, 


(4.15) 




d 

- cr ^ Ai^i — 2 sech^ 


2 + 



(4.16) 


/=! i=l 

The quantity of interest is chosen to be u{x, f; ^) at x = 6, f = 1 with cr = 0.1, P = 4, and the number of gPC basis 
functions N = 1001. For this example, the combined iterative rotation and re-weighted method outperforms all 
other approaches. However, unlike previous examples the non-rotated re-weighted method works better than 
our iteratively rotated unweighted method. This difference likely arises because c is sparser in this case than in 
others, which makes re-weighted method more efficient. The pattern of sparsity in this case is different than 
previous examples, hence the efficiency of identifying a good rotation matrix A is different. A comparison of c 
and c are presented in Fig. 9, which shows the improvement of the sparsity by the iterative rotation method. 


4.5. Example high-dimensional function 


The previous examples demonstrate the capability of our new method to solve moderately high-dimensional 
problems. In the last example, we illustrate its potential for dealing with higher-dimensional problems. Specially, 
we select a function similar to the first example (Sec. 4.1) but with much higher dimensionality: 

d f d 




d = 100. 


(4.17) 


!=1 V(=I 

The total number of basis functions for this example is N - 5151. The relative error is computed with a level-3 
sparse grid method, hence the numerical integrals are exact. The results are presented in Fig. 10. As before, our 
iterative rotation approach out-performs the existing ti and OMP methods. A comparison of c and c is presented 
in Fig. 11 and it shows the enhancement of the sparsity by the iterative rotation method. 
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Figure 9: Magnitude of the gPC coefficients for example KdV equation. (Left) Magnitude of (Right) Magnitude of of a randomly 
chosen replicate computed by re-weighted and iteratively rotated €[ with M = 120 {MjN » 0.12). These calculations were performed with 
dimension = 10 and the number of unknowns N = 1001. 




Figure 10: Results for the example high-dimensional function. (Left) Comparison with l\ methods, “o”: standard rotated i\ with 1 

iterations, rotated f’l with 2 iterations; “o”: rotated with 3 iterations. (Right) Comparison with OMP methods. rotated OMP with 
1 iterations, rotated OMP with 2 iterations; “o”: rotated OMP with 3 iterations. These calculations were performed with d = 100 and the 
number of unknowns N = 5151. 


For general high-dimensional problems, simply truncating the gPC expansion up to a certain order is not 
efficient because the number of basis function will be very large. For example, in this test, P = 2 requires 5151 
basis functions. Under such conditions even a small M/N = 0.2 needs 1030 samples, which can be difficult in 
practical problems when the computational model is very costly. Hence, a good approach for high-dimensional 
problems is to integrate our iterative rotation method with an adaptive method to reduce N', e.g., adaptive basis 
selection [22] or an ANOVA method [47]). 

4.6. Accuracy of computing the expansion coefficients c 

In many applications, gPC expansions are also used to study the sensitivity of the quantity of interest to the 
input random variables In order to perform this analysis, we need to transform the Ug{t]) - YI^=\ back 

to the original variables Ug{^) = This transformation can be accomplished through inversion of A 

in 1 ] - A^. In Figures 12 and 13, we present the coefficients of Ug{^) in examples 4.1 and 4.4, respectively. In 
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Figure 11: Magnitude of the gPC coefficients for example high-dimensional function. (Left) Magnitude of c^. (Right) Magnitude of c,j of 
a randomly chosen replicate computed by 3 iterated OMP method with M = 1200 {MIN « 0.23). These calculations were performed with 
dimension = 100 and the number of unknowns N = 5151. 



Figure 12: Comparison of gPC coefficients for the example function with equally important random variables (Sec. 4.1). (Left) Coefficients 
calculated by the standard method, “o”: exact |cjl; lc/| by standard /’i method. (Right) Coefficients calculated by our new iteratively 
rotated £[ method with L = 9 iterations. 


both figures we randomly choose one test from the 100 replicates. In Figure 12, we select a result with M — 180 
(MjN ^ 0.4). Using the standard €\ method (left) gives very inaccurate results for c. However, Figure 12 (right) 
shows that the iterative rotation method with L — 9 iterations gives much more accurate results for c. We observe 
the same behavior in Figure 13, where we chose a test with M - 120 {MjN ^ 0.12) for the example KdV equation 
(Sec. 4.4). In order to make this figure legible, we only present those c, with absolute value larger than 10“^. 
This example demonstrates that coefficients c„ with magnitude larger than 10“^ are computed accurately by the 
combined iterative rotation and re-weighted method while the standard obtained significantly less accurate 
c„. This difference is more distinct for \c„\ e [10“"^, 10'^]; in this range our new method compute c„ much more 
accurate than the standard ti method. In the lower right corner of the left plot, the standard (\ method yields many 
Cn which should not appear in that area. As a comparison, we do not see such c„ calculated with the new method. 
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Figure 13: Comparison of gPC coefficients for solutions of the KdV equation (Sec. 4.4). (Left) Coefficients calculated with the standard 
method, “o”: “exact” [c/|; |c/| by standard method. (Right) Coefficients calculated by our new iteratively rotated re-weighted £\ method 

(right). Only Ic/I > 10“^ are presented. 


5. Conclusions 

In this paper, we extend our previous work [ 17] and have introduced a compressive sensing-based gPC method 
to increase the sparsity and accuracy of Hermite polynomial expansion with iterative rotations. Similar to the active 
subspace method [28, 17], the rotation is decided by seeking the directions of maximum variation for the quantity 
of interest. Our current numerical examples are intended to demonstrate the ability of the method to increase the 
sparsity and accuracy of the gPC expansion; therefore, the quantity of interest only relies on the random variables. 
It is also possible to include the physical variables in the basis functions, i.e., u{x\Cntl^nix', (e.g, [16]), 
our future work will explore how this new method may help to increase the sparsity in such cases. 

We have demonstrated the method for minimization and OMP methods but it can also be integrated with 
other compressive sensing methods. In particular, future work will investigate the integration of our new methods 
with advanced sampling strategies (e.g., [20]), adaptive basis selection method (e.g., [22]), Bayesian compres¬ 
sive sensing method (e.g.,[16]), etc. These advanced strategies are particularly important for high-dimensional 
problems. 

With this method, we will also be able to construct an accurate surrogate model of the quantity of interest 
with limited data. Surrogate models are specifically useful for the problems where the experiments or simulations 
are very costly. This surrogate model can be used to study the sensitivity of the parameters and is very useful in 
inverse problems based on Bayesian framework. Our new method requires fewer output data to construct such 
surrogate models, which can be a great savings of experimental or computational resources. 

Finally, we highlight three additional areas of future work for improving the new method. First, it is currently 
only suitable for Hermite polynomial expansions. Second, the new method requires a formal numerical analysis 
to assess convergence behavior and determine specific terminating criteria. Finally, there are likely more optimal 
iterative rotation strategies that can be applied to Hermite polynomial or other expansions. One possible direction 
of work is to design a suitable objective function and consider this problem from an optimization point of view. 
All of these questions will be addressed in our future work. 


18 




Acknowledgments 


This work is supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific- 
Computing Research as part of the Collaboratory on Mathematics for Mesoscopic Modeling of Materials (CM4). 
Pacific Northwest National Laboratory is operated by Battelle for the DOE under Contract DE-AC05-76RL01830. 


Appendix 


Here we provide the details of computing the elements (Kij)ki in Eq. (3.6). Notice that for univariate normalized 


Hermite polynomials. 


« e N u (0), 


(A-1) 


where we set = 0 for simplicity. Therefore, we have 


r (A-2) 

where p(^) is the PDE of a standard Gaussian random variable. Eor a multi-index a = {ai,a 2 , ■■ ■ , aj), o', 6 Nu{0}, 
and basis function ''' ^aMd), 

d ‘‘ 

— = ifraX^iY n (A-3) 

m=l 

mi^i 


Hence, given two different multi-indices or^ = ({ak)y,{(^k) 2 ^ ^ (o^k)^) and a/ = ((ai)^,(ai)^, • • • , (a/)^), the corre¬ 

sponding entry of matrix Kij is 


I 


= E 


II (^m) 


m=l 

mi^i 


w 




m-1 

m*i 




n * 

m=l 

j 


(at),,, (or;),,, 
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